Modelling Data Dispersion Degree in Automatic Robust Estimation for Multivariate Gaussian Mixture Models with an Application to Noisy Speech Processing

نویسندگان

  • Dalei Wu
  • Haiqing Wu
چکیده

The trimming scheme with a prefixed cutoff portion is known as a method of improving the robustness of statistical models such as multivariate Gaussian mixture models (MGMMs) in small scale tests by alleviating the impacts of outliers. However, when this method is applied to realworld data, such as noisy speech processing, it is hard to know the optimal cut-off portion to remove the outliers and sometimes removes useful data samples as well. In this paper, we propose a new method based on measuring the dispersion degree (DD) of the training data to avoid this problem, so as to realise automatic robust estimation for MGMMs. The DD model is studied by using two different measures. For each one, we theoretically prove that the DD of the data samples in a context of MGMMs approximately obeys a specific (chi or chi-square) distribution. The proposed method is evaluated on a real-world application with a moderately-sized speaker recognition task. Experiments show that the proposed method can significantly improve the robustness of the conventional training method of GMMs for speaker recognition.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering

Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...

متن کامل

Improving the performance of MFCC for Persian robust speech recognition

The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition but they are very sensitive to noise. In this paper to achieve a satisfactorily performance in Automatic Speech Recognition (ASR) applications we introduce a noise robust new set of MFCC vector estimated through following steps. First, spectral mean normalization is a pre-processing which applies to t...

متن کامل

Robust Estimation and Adaptation of Subspace Gaussian Mixture Models for Automatic Speech Recognition

In conventional hidden Markov model (HMM) based speech recognisers, the emitting HMM states are modelled by Gaussian Mixture Models (GMMs), with parameters been estimated directly from the training data. However, in Subspace Gaussian mixture model(GMM) based acoustic modelling, the parameters of each state model are derived from the globally shared model subspaces which are normally low dimensi...

متن کامل

Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty

In this paper an estimator for speech enhancement based on Laplacian Mixture Model has been proposed. The proposed method, estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE  estimator, when the clean speech DFT coefficients are supposed mixture of Laplacians and the DFT coefficients of  noise are assumed zero-mean Gaussian distribution. Furthermore, the MMS...

متن کامل

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1405.4599  شماره 

صفحات  -

تاریخ انتشار 2013